Simple plotting


In [1]:
import pysal as ps
import pandas as pd
import numpy as np

This notebook will cover simple plotting. So that we can visualize plots within the notebook, we first must "turn on" the notebook plotting capabilities.

Commands in a Jupyter notebook that start with % or %% are known as magics, and are essentially directions to the Jupyter kernel itself. They are not part of the Python language, and are handled by the kernel rather than executed as ordinary Python statements. The command to enable inline plotting in a notebook is %matplotlib inline. Another magic, %matplotlib notebook, provides some additional tools which we will cover.

The standard Python plotting library, matplotlib, has a submodule, pyplot, that provides a convenient environment for plotting functions. So, we will import matplotlib.pyplot directly, as is commonly done in plotting code.


In [2]:
import matplotlib.pyplot as plt
%matplotlib inline

Before we do any plotting, let's read in some of the data that we have used before, the historical per-capita income data for US States:


In [3]:
path = ps.examples.get_path('usjoin.csv')
#remember, this is a csv, so you should use pandas.read_csv to get a dataframe
data = pd.read_csv(path, index_col='STATE_FIPS') 
W = ps.queen_from_shapefile(ps.examples.get_path('us48.shp'), idVariable='STATE_FIPS')

In [4]:
data.head()


Out[4]:
Name 1929 1930 1931 1932 1933 1934 1935 1936 1937 ... 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
STATE_FIPS
1 Alabama 323 267 224 162 166 211 217 251 267 ... 23471 24467 25161 26065 27665 29097 30634 31988 32819 32274
4 Arizona 600 520 429 321 308 362 416 462 504 ... 25578 26232 26469 27106 28753 30671 32552 33470 33445 32077
5 Arkansas 310 228 215 157 157 187 207 247 256 ... 22257 23532 23929 25074 26465 27512 29041 31070 31800 31493
6 California 991 887 749 580 546 603 660 771 795 ... 32275 32750 32900 33801 35663 37463 40169 41943 42377 40902
8 Colorado 634 578 471 354 353 368 444 542 532 ... 32949 34228 33963 34092 35543 37388 39662 41165 41719 40093

5 rows × 82 columns

Pandas provides two simple and fast plotting methods, hist and plot.

hist will plot a histogram of data, and can be called either on the entire dataframe or on individual series/columns.

Pandas histograms

Pandas has a histogram method for any column (Series) or table (DataFrame). It is designed to be easy to use, and typically passes arbitrary keyword options down to the underlying matplotlib.pyplot.hist function:


In [5]:
data['1929'].hist()


Out[5]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb3f4405350>

In [6]:
data['1929'].hist(color='black', alpha=.4, orientation='horizontal', bins=10)


Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb3f414fc90>
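
As a minimal sketch of how keyword options travel from the pandas call to matplotlib, the following uses a small made-up series (the values are illustrative and do not come from usjoin.csv) and then reads the options back from the bars that matplotlib drew. The Agg backend is selected only so the sketch also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative values only; not taken from usjoin.csv
incomes = pd.Series([323, 600, 310, 991, 634, 1024, 1032, 518])

fig, ax = plt.subplots()
# bins, color and alpha are all passed along to matplotlib's hist
incomes.hist(bins=4, color="black", alpha=0.4, ax=ax)

# Each bar is a matplotlib patch, so the options can be read back from it
n_bars = len(ax.patches)
bar_alpha = ax.patches[0].get_alpha()
print(n_bars, bar_alpha)
```

Because the bars are ordinary matplotlib objects, anything matplotlib's hist accepts as a keyword should be usable in the pandas call in the same way.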

The pandas histogram method is not as detailed as the matplotlib histogram function, but one very useful option it offers, for both columns and tables, is by, which lets you quickly construct histograms by group.

For instance, let's introduce a dummy variable denoting whether or not a state is in the US south:


In [7]:
south_dummy = [[u'Alabama', 1],
       [u'Arizona', 0],
       [u'Arkansas', 1],
       [u'California', 0],
       [u'Colorado', 0],
       [u'Connecticut', 0],
       [u'Delaware', 1],
       [u'District of Columbia', 1],
       [u'Florida', 1],
       [u'Georgia', 1],
       [u'Idaho', 0],
       [u'Illinois', 0],
       [u'Indiana', 0],
       [u'Iowa', 0],
       [u'Kansas', 0],
       [u'Kentucky', 1],
       [u'Louisiana', 1],
       [u'Maine', 0],
       [u'Maryland', 1],
       [u'Massachusetts', 0],
       [u'Michigan', 0],
       [u'Minnesota', 0],
       [u'Mississippi', 1],
       [u'Missouri', 0],
       [u'Montana', 0],
       [u'Nebraska', 0],
       [u'Nevada', 0],
       [u'New Hampshire', 0],
       [u'New Jersey', 0],
       [u'New Mexico', 0],
       [u'New York', 0],
       [u'North Carolina', 1],
       [u'North Dakota', 0],
       [u'Ohio', 0],
       [u'Oklahoma', 1],
       [u'Oregon', 0],
       [u'Pennsylvania', 0],
       [u'Rhode Island', 0],
       [u'South Carolina', 1],
       [u'South Dakota', 0],
       [u'Tennessee', 1],
       [u'Texas', 1],
       [u'Utah', 0],
       [u'Vermont', 0],
       [u'Virginia', 1],
       [u'Washington', 0],
       [u'West Virginia', 1],
       [u'Wisconsin', 0],
       [u'Wyoming', 0]]

In [8]:
south_dummy = pd.DataFrame(south_dummy, columns=['NAME', 'SOUTH'])

Now, we can merge this with our existing data using the merge method. This operates like any standard table join:


In [9]:
data = data.merge(south_dummy, left_on='Name', right_on='NAME')
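
To make the join behaviour concrete, here is a small sketch on two toy tables (the three rows and their values are illustrative only). It shows what merge returns when the key columns have different names on each side.

```python
import pandas as pd

# Toy tables; the values are illustrative only
income = pd.DataFrame({"Name": ["Alabama", "Arizona", "Arkansas"],
                       "2009": [32274, 32077, 31493]})
south = pd.DataFrame({"NAME": ["Alabama", "Arizona", "Arkansas"],
                      "SOUTH": [1, 0, 1]})

# Inner join (the default) on differently named key columns
merged = income.merge(south, left_on="Name", right_on="NAME")

# Both key columns are kept, and the result gets a fresh 0..n-1 index
print(merged.columns.tolist())
print(merged.index.tolist())
```

Note that merging on columns replaces the original index with a new integer index, which is why tables printed after the merge are labelled 0, 1, 2, ... rather than by STATE_FIPS.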

Now, we can quickly make plots of the per capita income distribution, split up by the dummy variable:


In [10]:
data['1990'].hist(by=data.SOUTH)


Out[10]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x7fb3f3fa0310>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7fb3f3fd9c10>], dtype=object)

Using pyplot

Pyplot, a submodule of matplotlib, is the main driver for most plotting code in Python. It has a few basic commands that we will use to make statistical plots.

Most importantly, though, the matplotlib gallery provides a good reference for different commonly-encountered plotting problems.

First, though, we will cover basic line and point plotting.


In [11]:
plt.plot([4,2,1,3])


Out[11]:
[<matplotlib.lines.Line2D at 0x7fb3f3cda110>]

First, note that when plot is passed a single list or array, the points $(i, y_i)$ are plotted, where $i$ is the zero-based position of the element $y_i$ in the list.

When we pass two lists, matplotlib interprets the first list as the x-coordinates and the second as a list of the y-coordinates.


In [12]:
plt.plot([4,2,1,3], [0,5,2,1])


Out[12]:
[<matplotlib.lines.Line2D at 0x7fb3f3c23450>]
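
The two calling conventions can be checked directly by asking the returned line objects for their data. This is a small sketch using the same numbers as the cells above; the variable names are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# One sequence: treated as y, with x implied as 0, 1, 2, ...
(line_y_only,) = ax.plot([4, 2, 1, 3])

# Two sequences: the first is x, the second is y
(line_xy,) = ax.plot([4, 2, 1, 3], [0, 5, 2, 1])

implied_x = list(line_y_only.get_xdata())
explicit_x = list(line_xy.get_xdata())
print(implied_x, explicit_x)
```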

There are a few ways to plot many different lines at once. The easiest way to do this is to run multiple plot commands before calling plt.show(). This adds each line generated from plt.plot to the same figure, which is then shown when plt.show() is called.


In [13]:
plt.plot([4,2,1,3], [0,5,2,1])
plt.plot([5,2,1,3], [0,4,1,0])
plt.show()


In the same way, you can call matplotlib's customization functions before plt.show(), and they will be applied to the current figure:


In [14]:
plt.plot([4,2,1,3], [0,5,2,1])
plt.plot([5,2,1,3], [0,4,1,0])
plt.title('Trajectories')
plt.ylabel('$\\theta$', fontsize=20)
plt.xlabel('x')
plt.show()
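
The behaviour above relies on pyplot keeping track of a current figure and current axes. As a rough sketch, the state can be inspected with plt.gca() before anything is shown:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt

plt.figure()  # start from a fresh figure
plt.plot([4, 2, 1, 3], [0, 5, 2, 1])
plt.plot([5, 2, 1, 3], [0, 4, 1, 0])
plt.title("Trajectories")

# Both plot calls and the title were applied to the same current axes
current_axes = plt.gca()
n_lines = len(current_axes.lines)
current_title = current_axes.get_title()
print(n_lines, current_title)
```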


One very powerful aspect of matplotlib is that, when given a two-dimensional array, it plots each column of the array as a separate line. To show how this is powerful, let's make a plot of the per capita income of all states over time.

Since we have income data from 1929 to 2009, and typing all of those columns would be tedious, let's do it using Python:


In [15]:
columns = [str(year) for year in range(1929, 2010) ]

In [16]:
columns


Out[16]:
['1929',
 '1930',
 '1931',
 '1932',
 '1933',
 '1934',
 '1935',
 '1936',
 '1937',
 '1938',
 '1939',
 '1940',
 '1941',
 '1942',
 '1943',
 '1944',
 '1945',
 '1946',
 '1947',
 '1948',
 '1949',
 '1950',
 '1951',
 '1952',
 '1953',
 '1954',
 '1955',
 '1956',
 '1957',
 '1958',
 '1959',
 '1960',
 '1961',
 '1962',
 '1963',
 '1964',
 '1965',
 '1966',
 '1967',
 '1968',
 '1969',
 '1970',
 '1971',
 '1972',
 '1973',
 '1974',
 '1975',
 '1976',
 '1977',
 '1978',
 '1979',
 '1980',
 '1981',
 '1982',
 '1983',
 '1984',
 '1985',
 '1986',
 '1987',
 '1988',
 '1989',
 '1990',
 '1991',
 '1992',
 '1993',
 '1994',
 '1995',
 '1996',
 '1997',
 '1998',
 '1999',
 '2000',
 '2001',
 '2002',
 '2003',
 '2004',
 '2005',
 '2006',
 '2007',
 '2008',
 '2009']

Now, we can use these to grab each year's data from our dataframe:


In [17]:
data[columns]


Out[17]:
1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 ... 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009
0 323 267 224 162 166 211 217 251 267 244 ... 23471 24467 25161 26065 27665 29097 30634 31988 32819 32274
1 600 520 429 321 308 362 416 462 504 478 ... 25578 26232 26469 27106 28753 30671 32552 33470 33445 32077
2 310 228 215 157 157 187 207 247 256 231 ... 22257 23532 23929 25074 26465 27512 29041 31070 31800 31493
3 991 887 749 580 546 603 660 771 795 771 ... 32275 32750 32900 33801 35663 37463 40169 41943 42377 40902
4 634 578 471 354 353 368 444 542 532 506 ... 32949 34228 33963 34092 35543 37388 39662 41165 41719 40093
5 1024 921 801 620 583 653 706 806 860 769 ... 40640 42279 42021 42398 45009 47022 51133 53930 54528 52736
6 1032 857 775 590 564 645 701 868 949 795 ... 31255 32664 33463 34123 35998 37297 39358 40251 40698 40135
7 518 470 398 319 288 348 376 450 487 460 ... 28145 28852 29499 30277 32462 34460 36934 37781 37808 36565
8 347 307 256 200 204 244 268 302 313 290 ... 27940 28596 28660 29060 29995 31498 32739 33895 34127 33086
9 507 503 374 274 227 403 399 475 423 426 ... 24180 25124 25485 25912 27846 29003 30954 32168 32322 30987
10 948 807 671 486 437 505 573 650 731 648 ... 32259 32808 33325 34205 35599 36825 39220 41238 42049 40933
11 607 514 438 310 294 359 421 481 547 472 ... 27011 27590 28059 29089 30126 30768 32305 33151 33978 33174
12 581 510 400 297 253 269 425 393 523 458 ... 26723 27315 28232 28835 31027 31656 33177 35008 36726 35983
13 532 467 401 266 250 287 362 387 428 383 ... 27816 28979 29067 30109 31181 32367 34934 36546 37983 37036
14 393 325 291 211 205 233 265 294 341 297 ... 24294 24816 25297 25777 26891 27881 29392 30443 31302 31250
15 414 355 318 241 227 265 290 330 353 348 ... 23334 25116 25683 26434 27776 29785 33438 34986 35730 35151
16 601 576 491 377 371 416 430 506 510 471 ... 25623 27068 27731 28727 30201 30721 32340 33620 34906 35268
17 768 712 638 512 466 523 548 618 665 633 ... 33872 35430 36293 37309 39651 41555 43990 45827 47040 47159
18 906 836 759 613 559 609 643 714 732 672 ... 37992 39247 39238 39869 41792 43520 46893 49361 50607 49590
19 790 657 540 394 347 453 530 619 685 572 ... 29612 30196 30410 31446 31890 32516 33452 34441 35215 34280
20 599 552 457 363 308 358 451 472 540 494 ... 32101 32835 33553 34744 36505 37400 39367 41059 42299 40920
21 286 202 175 127 131 174 177 229 224 201 ... 20993 22222 22540 23365 24501 26120 27276 28772 29591 29318
22 621 561 491 365 334 367 420 466 508 475 ... 27445 28156 28771 29702 30847 31644 33354 34558 35775 35106
23 592 501 382 339 298 364 476 475 512 517 ... 22569 24342 24699 25963 27517 28987 30942 32625 33293 32699
24 596 521 413 307 275 259 409 396 415 405 ... 27829 29098 29499 31262 32371 33395 34753 36880 38128 37057
25 868 833 652 550 495 546 658 843 762 780 ... 30529 30718 30849 32182 34757 37555 38652 40326 40332 38009
26 686 647 558 427 416 476 498 537 565 533 ... 33332 33940 34335 34892 36758 37536 39997 41720 42461 41882
27 918 847 736 587 523 573 625 709 747 697 ... 36983 37959 38240 38768 40603 42142 45668 48172 49233 48123
28 410 334 289 208 211 247 292 343 362 338 ... 22203 24193 24446 25128 26606 28180 29778 31320 32585 32197
29 1152 1035 881 676 626 680 722 808 838 789 ... 34547 35371 35332 36077 38312 40592 43892 47514 48692 46844
30 332 292 248 187 208 253 271 297 324 295 ... 27194 27650 27726 28208 29769 31209 32692 33966 34340 33564
31 382 311 187 176 146 180 272 234 326 282 ... 25068 26118 26770 29109 29676 31644 32856 35882 39009 38672
32 771 661 563 400 385 455 516 593 648 561 ... 28400 28966 29522 30345 31240 32097 33643 34814 35521 35018
33 455 368 301 216 222 252 298 321 376 346 ... 23517 25059 25059 25719 27516 29122 31753 32781 34378 33708
34 668 607 505 379 358 439 458 548 556 531 ... 28350 28866 29387 30172 31217 32108 34212 35279 35899 35210
35 772 712 600 449 417 482 517 601 636 563 ... 29539 30085 30840 31709 33069 34131 36375 38003 39008 38827
36 874 788 711 575 559 600 645 711 731 672 ... 29685 31378 32374 33690 35318 36461 38610 40421 41542 41283
37 271 243 205 159 175 211 229 258 273 250 ... 24321 24871 25279 25875 27057 28337 29990 30958 31510 30835
38 426 366 241 189 129 184 309 244 323 320 ... 26115 27531 27727 30072 31765 32726 33320 35998 38188 36499
39 378 325 277 198 204 245 264 304 334 300 ... 26239 27059 27647 28501 29734 30764 32314 33578 34243 33512
40 479 412 348 266 257 294 326 372 418 404 ... 27871 28519 28295 28929 30392 32448 34489 36020 36969 35674
41 551 498 369 305 298 310 389 463 444 444 ... 23907 24899 25010 25192 26169 27905 29582 31009 31253 30107
42 634 576 474 365 338 383 414 471 485 457 ... 26901 28140 28651 29609 31240 31920 34394 36018 36940 36752
43 434 384 370 284 285 320 350 390 423 390 ... 31162 32747 33235 34451 36285 38304 40644 42506 43409 43211
44 741 658 534 402 376 443 490 569 599 582 ... 31528 32053 32206 32934 34984 35738 38477 40782 41588 40619
45 460 408 356 257 259 313 337 390 418 370 ... 21915 23333 24103 24626 25484 26374 28379 29769 31265 31843
46 673 588 469 362 333 380 461 518 551 507 ... 28232 29161 29838 30657 31703 32625 34535 35839 36594 35676
47 675 585 476 374 371 411 496 551 607 561 ... 27230 29122 29828 31544 33721 36683 41548 43453 45177 42504

48 rows × 81 columns

Now, when plot is passed a two-dimensional array, it interprets each column of the array as its own line. In the table above, each row is a state and each column is a year, so to plot each state's per-capita income over time we need the states along the columns and the years along the rows.

Fortunately, we can simply transpose the values matrix and get what we need:


In [18]:
plt.plot(columns, data[columns].values.T)
plt.title('Raw Per Capita Income, 1929-2009')
plt.xlabel('Year')
plt.ylabel('Constant dollars per person')
plt.show()
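
The effect of the transpose can be seen on a much smaller array. The sketch below uses a made-up table with the same layout as the income data (rows are "states", columns are "years") and counts the lines that matplotlib creates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Made-up table: 3 "states" (rows) x 5 "years" (columns)
years = np.arange(2005, 2010)
table = np.array([[10, 11, 12, 13, 14],
                  [20, 19, 18, 17, 16],
                  [5, 5, 6, 6, 7]])

fig, ax = plt.subplots()
# The transposed array has shape (5, 3): 5 x-positions, one line per column
lines = ax.plot(years, table.T)

n_lines = len(lines)
first_line_y = list(lines[0].get_ydata())
print(n_lines, first_line_y)
```

Each of the three lines follows one row of the original table, that is, one "state" through time.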


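If we wanted to normalize the data, we could do this in an array-wise fashion: subtract each year's mean across states, then divide by that year's standard deviation (the square root of the variance).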


In [19]:
centered_pci = data[columns].values - data[columns].values.mean(axis=0)
normalized_pci = centered_pci / centered_pci.var(axis=0)**.5
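
The same two steps can be checked on a small made-up array. After standardizing, every column should have mean 0 and standard deviation 1; the array and variable names below are illustrative only.

```python
import numpy as np

# Made-up values: 4 "states" (rows) x 3 "years" (columns)
values = np.array([[1.0, 10.0, 100.0],
                   [2.0, 20.0, 300.0],
                   [3.0, 30.0, 200.0],
                   [4.0, 40.0, 400.0]])

# mean(axis=0) has shape (3,), so it broadcasts across the rows
centered = values - values.mean(axis=0)
# The square root of the (population) variance is the standard deviation
zscores = centered / centered.var(axis=0) ** 0.5

col_means = zscores.mean(axis=0)
col_stds = zscores.std(axis=0)
print(col_means, col_stds)
```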

In [20]:
plt.plot(columns, normalized_pci.T)
plt.title('Normalized Per Capita Income change, 1929-2009')
plt.xlabel('Year')
plt.ylabel('Dollars/person normalized by year')
plt.show()


Again, since matplotlib interprets the first list as x-coordinates and the second as y-coordinates, we can make scatterplots very quickly.

For instance, we can make a Moran scatterplot, a common spatial dependence diagnostic plot, from this data. First, we need to grab the last year's income data:


In [21]:
last = data['2009'].values

Then, we can use the pysal.lag_spatial function, along with our row-standardized weights, to construct the spatial lag of the 2009 per capita income:


In [22]:
W.transform = 'r'

In [23]:
Wlast = ps.lag_spatial(W, last)
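
Conceptually, the spatial lag with row-standardized weights is the average of each unit's neighbors. The following is a plain NumPy sketch of that idea on a hypothetical four-unit chain; it is a stand-in for illustration and not PySAL's implementation.

```python
import numpy as np

# Hypothetical 4-unit chain 0-1-2-3 with binary contiguity weights
w_binary = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]], dtype=float)

# Row-standardize so that each row sums to 1
w_row = w_binary / w_binary.sum(axis=1, keepdims=True)

y = np.array([10.0, 20.0, 30.0, 40.0])  # made-up values

# Spatial lag: for each unit, the mean of its neighbors' values
lag = np.dot(w_row, y)
print(lag)
```

For example, unit 1 has neighbors 0 and 2, so its lag is (10 + 30) / 2 = 20.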

Since this is a scatterplot, we don't want to use the default drawing behavior, which connects all points plotted with a line.

Matplotlib has two interfaces to change line parameters. The first uses a format string to specify plotting parameters such as color and marker; the string can contain a color and a marker style in any order. The second uses keyword arguments such as color, marker, and linestyle.

For example, the format string '.k' plots each point as a small black dot:


In [24]:
plt.plot(last, Wlast, '.k')


Out[24]:
[<matplotlib.lines.Line2D at 0x7fb3f3b2bb10>]
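
The notebook only demonstrates the format-string interface. Assuming the second interface meant here is matplotlib's keyword arguments (color, marker, linestyle), the sketch below draws the same made-up points both ways and compares the resulting line properties.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]  # made-up points
y = [2, 1, 4, 3]

fig, ax = plt.subplots()
# Format-string interface: point marker, black, no connecting line
(line_fmt,) = ax.plot(x, y, ".k")
# Keyword interface: the same appearance, spelled out
(line_kw,) = ax.plot(x, y, marker=".", color="k", linestyle="None")

same_marker = line_fmt.get_marker() == line_kw.get_marker()
same_color = line_fmt.get_color() == line_kw.get_color()
same_linestyle = line_fmt.get_linestyle() == line_kw.get_linestyle()
print(same_marker, same_color, same_linestyle)
```

The format string is compact for quick plots, while keywords tend to be easier to read when several properties are set at once.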

For larger dots, you can use the o marker:


In [25]:
plt.plot(last, Wlast, 'ok')


Out[25]:
[<matplotlib.lines.Line2D at 0x7fb3f37403d0>]

To add vertical, horizontal, or sloped lines to the plot, you can use dedicated line-plotting functions such as plt.vlines and plt.hlines.

For a simple scatter plot with line of best fit, you'll need to estimate a regression on the data and use the slope and intercept from that. We can do this very quickly using PySAL:


In [26]:
reg = ps.spreg.OLS(Wlast.reshape(-1,1), last.reshape(-1,1))
#b,a = np.polyfit(last, Wlast, 1) #will also work; note that polyfit returns the slope first

In [27]:
a, b = reg.betas.flatten()  # intercept and slope as plain scalars
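
One detail worth checking when swapping in np.polyfit is the order of the returned coefficients. The sketch below fits made-up points that lie exactly on the line y = 3 + 2x, so the expected slope and intercept are known in advance.

```python
import numpy as np

# Made-up points lying exactly on y = 3 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 + 2.0 * x

# polyfit returns coefficients from the highest power down: slope, then intercept
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

This is the reverse of the order used for reg.betas in this notebook, where the first coefficient is treated as the intercept and the second as the slope.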

Finally, to put it all together, we will draw reference axes as dashed vertical and horizontal lines through the means of X and Y, and will draw the line of best fit. In addition, we can annotate the plot with text, so we will add Moran's $I$, which here is the slope of the line of best fit, to the plot:


In [28]:
plt.plot(last, Wlast, 'ok')
# dashed vertical line at the mean of the last year's PCI
plt.vlines(last.mean(), Wlast.min(), Wlast.max(), linestyle='--')
# dashed horizontal line at the mean of the lagged PCI
plt.hlines(Wlast.mean(), last.min(), last.max(), linestyle='--')
# red line of best fit
plt.plot(last, a + b*last, 'r')
plt.text(s='$I = %.3f$' %b, x=39000, y=41000, fontsize=18)
plt.title('Moran Scatterplot')
plt.ylabel('Spatial Lag of PCI')
plt.xlabel('2009 PCI')


Out[28]:
<matplotlib.text.Text at 0x7fb3f1e5a3d0>
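
The link between Moran's $I$ and the regression slope can be verified numerically. The following NumPy sketch assumes row-standardized weights in which every unit has at least one neighbor, and uses a hypothetical four-unit chain with made-up values; for that case $I = z'Wz / z'z$, where $z$ holds the deviations from the mean.

```python
import numpy as np

# Hypothetical 4-unit chain 0-1-2-3, row-standardized
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
w = w / w.sum(axis=1, keepdims=True)

y = np.array([1.0, 3.0, 2.0, 5.0])  # made-up values
lag = np.dot(w, y)

# Moran's I with row-standardized weights: z'Wz / z'z
z = y - y.mean()
morans_i = np.dot(z, np.dot(w, z)) / np.dot(z, z)

# Slope of the least-squares line of the lag on y (polyfit returns the slope first)
slope, intercept = np.polyfit(y, lag, 1)
print(morans_i, slope)
```

With weights that are not row-standardized, the two quantities differ by the usual $n / S_0$ scaling, so the slope should only be read as Moran's $I$ under the assumption stated above.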

Seaborn

In addition to matplotlib, the most commonly used plotting library in Python, it is good to know about seaborn, a library dedicated to making simple statistical plots.

Like matplotlib, seaborn has a deep gallery with many different types of visualizations. Seaborn's focus is on quick but pretty statistical visualizations, so it comes with many more specialized plot types than the distribution plots shown above.

It can help us make better-looking plots with little effort, and is especially helpful when the same plot has to be reused in different contexts, such as a paper or a talk.

There are too many different dedicated statistical plot types in seaborn to cover now, but we'll show how seaborn affects standard plotting in matplotlib, as well as show some of the more useful pre-baked plot types in seaborn.


In [29]:
import seaborn as sns


First, let's re-plot the Moran scatterplot from above:


In [30]:
plt.plot(last, Wlast, 'ok')
# dashed vertical line at the mean of the last year's PCI
plt.vlines(last.mean(), Wlast.min(), Wlast.max(), linestyle='--')
# dashed horizontal line at the mean of the lagged PCI
plt.hlines(Wlast.mean(), last.min(), last.max(), linestyle='--')
# red line of best fit
plt.plot(last, a + b*last, 'r')
plt.text(s='$I = %.3f$' %b, x=39000, y=41000, fontsize=18)
plt.title('Moran Scatterplot')
plt.ylabel('Spatial Lag of PCI')
plt.xlabel('2009 PCI')


Out[30]:
<matplotlib.text.Text at 0x7fb3f0eb1e50>

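Note that it looks quite different. In the seaborn version used here, importing seaborn changes a few of the basic graphical parameters of matplotlib. (Since seaborn 0.8, the styling is no longer applied on import; call sns.set() to get the same effect.)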

Once imported, the set_context function can be used to scale the content of a graph up or down, depending on the context in which it might be used.

For instance, let's look at the difference between the paper context:


In [31]:
sns.set_context('paper')

plt.plot(last, Wlast, 'ok')
# dashed vertical line at the mean of the last year's PCI
plt.vlines(last.mean(), Wlast.min(), Wlast.max(), linestyle='--')
# dashed horizontal line at the mean of the lagged PCI
plt.hlines(Wlast.mean(), last.min(), last.max(), linestyle='--')
# red line of best fit
plt.plot(last, a + b*last, 'r')
plt.text(s='$I = %.3f$' %b, x=39000, y=41000, fontsize=18)
plt.title('Moran Scatterplot')
plt.ylabel('Spatial Lag of PCI')
plt.xlabel('2009 PCI')


Out[31]:
<matplotlib.text.Text at 0x7fb3f0eac550>

and the talk context:


In [32]:
sns.set_context('talk')

plt.plot(last, Wlast, 'ok')
# dashed vertical line at the mean of the last year's PCI
plt.vlines(last.mean(), Wlast.min(), Wlast.max(), linestyle='--')
# dashed horizontal line at the mean of the lagged PCI
plt.hlines(Wlast.mean(), last.min(), last.max(), linestyle='--')
# red line of best fit
plt.plot(last, a + b*last, 'r')
plt.text(s='$I = %.3f$' %b, x=39000, y=41000, fontsize=18)
plt.title('Moran Scatterplot')
plt.ylabel('Spatial Lag of PCI')
plt.xlabel('2009 PCI')


Out[32]:
<matplotlib.text.Text at 0x7fb3f3bac210>

Text and line sizes are rescaled quite significantly between the two contexts.
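
To see which settings differ between contexts without redrawing the figure, one option is seaborn's plotting_context function, which returns the parameters a context would set. This is a small sketch; the three keys printed are a selection chosen here for illustration.

```python
import seaborn as sns

# plotting_context returns the parameters for a context without applying them
paper = sns.plotting_context("paper")
talk = sns.plotting_context("talk")

# Compare a few of the scaled settings side by side
for key in ("font.size", "axes.labelsize", "lines.linewidth"):
    print(key, paper[key], talk[key])
```

The exact numbers depend on the seaborn version installed, but the talk values should be larger than the paper values.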

But seaborn can generate many other plots as well. The most useful of these tend to be kdeplot and distplot.

Kernel density plots

Kernel density plots are a common and powerful plotting technique. Seaborn makes it easy to do KDE plots in one and two dimensions:


In [33]:
sns.kdeplot(last)
plt.title('One-Dimensional KDE plot')
plt.xlabel('PCI')


Out[33]:
<matplotlib.text.Text at 0x7fb3f3c2b2d0>

In [34]:
sns.kdeplot(last, Wlast)
plt.title('Two-Dimensional KDE plot')
plt.ylabel('Lag PCI')
plt.xlabel('PCI')


Out[34]:
<matplotlib.text.Text at 0x7fb3f3b0bbd0>
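
For intuition about what a kernel density estimate is, the sketch below computes a one-dimensional Gaussian KDE with scipy.stats.gaussian_kde on a made-up sample. This is an illustration of the technique only; seaborn's kdeplot may use a different backend or bandwidth, so the curves need not match exactly.

```python
import numpy as np
from scipy import stats

# Made-up sample standing in for the income values (in thousands)
sample = np.array([29.3, 30.1, 31.5, 32.3, 33.7, 35.2, 36.6, 40.9, 47.2, 52.7])

# Gaussian KDE; the bandwidth defaults to Scott's rule
kde = stats.gaussian_kde(sample)

# Evaluate the estimated density on a grid that extends well past the data
grid = np.linspace(sample.min() - 30, sample.max() + 30, 2000)
density = kde(grid)

# A density should integrate to roughly 1 (simple Riemann sum)
area = np.sum(density) * (grid[1] - grid[0])
print(area)
```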

Instead of doing a two-dimensional plot, if you would like to plot two distributions in one frame, use two separate calls to sns.kdeplot, followed by a plt.show() call.

In addition, when using more than one line, you can label each line being plotted and use a legend by providing a label argument to each line plotting function, and then calling plt.legend().


In [35]:
plt.title('Kernel Densities of 2009 PCI')
sns.kdeplot(last, label='2009 PCI')
sns.kdeplot(Wlast, label='Lagged 2009 PCI')
plt.legend()
plt.show()


As always, there are many more options, so consult the documentation for more information about the flexibility of the kdeplot function.


In [36]:
sns.kdeplot?

Distplot

In many cases, we would like to visualize how well our data fit a specific distributional form. We can do this with seaborn via distplot.

By default, distplot will draw a histogram beneath a kernel density estimate:


In [37]:
sns.distplot(last)


Out[37]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb3f0d6fa50>

But, when supplied a distribution from scipy, the standard scientific Python library, it will fit that distribution to the data by estimating its parameters, and draw the fitted density over the histogram:


In [38]:
import scipy.stats as stats

In [39]:
sns.distplot(last, fit=stats.norm, kde=False)


Out[39]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb3f3955b50>

In [40]:
sns.distplot(last, fit=stats.gamma, kde=False)


Out[40]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb3f0b6f550>
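
The parameter estimation step can be reproduced directly with scipy. The sketch below assumes that the fit option relies on the distribution's own fit method (an assumption about distplot's internals) and uses a made-up sample; for the normal distribution, the maximum-likelihood estimates are simply the sample mean and the population standard deviation.

```python
import numpy as np
from scipy import stats

# Made-up sample standing in for the income values (in thousands)
sample = np.array([29.3, 30.1, 31.5, 32.3, 33.7, 35.2, 36.6, 40.9, 47.2, 52.7])

# Maximum-likelihood estimates of the normal's location and scale
loc, scale = stats.norm.fit(sample)

# The fitted density can then be evaluated on a grid and drawn over a histogram
grid = np.linspace(sample.min(), sample.max(), 200)
fitted_pdf = stats.norm.pdf(grid, loc, scale)
print(loc, scale)
```

Other scipy distributions, such as stats.gamma, expose the same fit method but return additional shape parameters.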

Heatmap

Heatmaps are useful for visualizing the structure of sparse matrices, since the few nonzero entries stand out against the background.

Here, we can visualize our spatial weights matrix for the 48 contiguous US States:


In [41]:
sns.heatmap(W.full()[0])
plt.xticks([])
plt.yticks([])
plt.show()


LMPlot


In [80]:
columbus = ps.pdio.read_files(ps.examples.get_path('columbus.shp'))
Wco = ps.queen_from_shapefile(ps.examples.get_path('columbus.shp'))
columbus['downtown'] = columbus.DISCBD < columbus.DISCBD.describe()['25%']

In [76]:
sns.lmplot('INC', 'HOVAL',columbus, hue='downtown')


Out[76]:
<seaborn.axisgrid.FacetGrid at 0x7fb3eac5f1d0>

Pairgrid


In [92]:
Wco.transform = 'r'
columbus['lag_HOVAL'] = ps.lag_spatial(Wco,columbus['HOVAL'].values)
sns.pairplot(columbus, kind='reg', vars=['HOVAL', 'lag_HOVAL', 'CRIME'], diag_kind='kde')


Out[92]:
<seaborn.axisgrid.PairGrid at 0x7fb3e0ac2710>